A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines

Authors

  • Sabato Marco Siniscalchi
  • Torbjørn Svendsen
  • Chin-Hui Lee
Abstract

A bottom-up, stepwise knowledge integration framework is proposed to realize detection-based large vocabulary continuous speech recognition (LVCSR) with a weighted finite state machine (WFSM). The WFSM framework offers a flexible architecture for composing different types of knowledge networks, each of which can be built and optimized independently. Speech attribute detectors serve as an intermediate block that produces phoneme posterior probabilities, over which a phoneme recognition network is designed. Lexical access and syntax knowledge are then integrated over this phoneme network to deliver the decoded sentences. Experimental evidence shows that the proposed system outperforms several hybrid HMM/ANN systems with different configurations on the Wall Street Journal task, while remaining competitive with conventional LVCSR technology.
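The stepwise composition of independently built knowledge networks can be illustrated with a toy example. The following is a minimal sketch, not the authors' system: it assumes epsilon-free weighted transducers in the tropical semiring (costs add along a path, the cheapest path wins), and the machines `P` (a phoneme confusion model) and `L` (a two-word lexicon), their costs, and the padding label `"_"` are all hypothetical values chosen for illustration.

```python
from collections import deque


def compose(a, b):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    A machine is a dict with 'start', 'finals' (state -> final cost) and
    'arcs' (state -> list of (in_label, out_label, cost, next_state)).
    """
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = state = queue.popleft()
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for ia, oa, wa, na in a["arcs"].get(qa, []):
            for ib, ob, wb, nb in b["arcs"].get(qb, []):
                if oa == ib:  # output of the first machine feeds the second
                    nxt = (na, nb)
                    arcs.setdefault(state, []).append((ia, ob, wa + wb, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


def best_path(m, inputs):
    """Return (cost, outputs) of the cheapest accepting path, or None."""
    frontier = {m["start"]: (0.0, [])}
    for sym in inputs:
        nxt = {}
        for state, (cost, outs) in frontier.items():
            for i, o, w, n in m["arcs"].get(state, []):
                if i == sym:
                    cand = (cost + w, outs + [o])
                    if n not in nxt or cand[0] < nxt[n][0]:
                        nxt[n] = cand
        frontier = nxt
    done = [(c + m["finals"][s], o) for s, (c, o) in frontier.items()
            if s in m["finals"]]
    return min(done, key=lambda r: r[0]) if done else None


# Hypothetical phoneme confusion model: detected label -> phoneme, with costs.
P = {"start": 0, "finals": {0: 0.0}, "arcs": {0: [
    ("h", "h", 0.5, 0), ("ay", "ay", 0.25, 0),
    ("ay", "iy", 2.0, 0), ("iy", "iy", 0.25, 0)]}}

# Hypothetical lexicon: "hi" = h ay, "he" = h iy. The padding label "_" is
# an ordinary symbol here, standing in for the epsilon outputs a real
# lexicon transducer would use.
L = {"start": 0, "finals": {2: 0.0}, "arcs": {
    0: [("h", "_", 0.0, 1)],
    1: [("ay", "hi", 0.0, 2), ("iy", "he", 0.0, 2)]}}

PL = compose(P, L)              # each network was built separately
result = best_path(PL, ["h", "ay"])
print(result)
```

Because `P` and `L` are defined separately and only joined by `compose`, either one could be replaced or retuned without touching the other, which is the property the abstract attributes to the WFSM framework. A grammar network could be composed onto `PL` in the same way.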


Similar resources

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...


Incremental Language Models for Speech Recognition Using Finite-state Transducers

In the context of the weighted finite-state transducer approach to speech recognition, we investigate a novel decoding strategy to deal with very large n-gram language models often used in large-vocabulary systems. In particular, we present an alternative to full, static expansion and optimization of the finite-state transducer network. This alternative is useful when the individual knowledge s...


A Brief Overview of Decoding Techniques for Large Vocabulary Continuous Speech Recognition

A number of decoding strategies for large vocabulary speech recognition are examined from the viewpoint of their search space representation. Different design solutions are compared with respect to the integration of linguistic and acoustic constraints, as implied by M-gram LMs and cross-word phonetic contexts. This study is articulated along two main axes, namely, the network expansion and the...


Spoken Language Processing Using Weighted Finite State Transducers

The main goal of this paper is to illustrate the advantages of weighted finite state transducers (WFSTs) for spoken language processing, namely in terms of their capacity to efficiently integrate different types of knowledge sources. We shall illustrate their applicability in several areas: large vocabulary continuous speech recognition, automatic alignment using pronunciation modeling rules, g...




Publication date: 2011